Private Sampling: A Noiseless Approach for Generating Differentially Private Synthetic Data
Authors
Abstract
In a world where artificial intelligence and data science become omnipresent, data sharing is increasingly locking horns with data-privacy concerns. Differential privacy has emerged as a rigorous framework for protecting individual privacy in a statistical database, while releasing useful information about the database. The standard way to implement differential privacy is to inject a sufficient amount of noise into the data. However, in addition to other limitations of differential privacy, this process of adding noise will affect data accuracy and utility. Another approach to enable privacy in data sharing is based on the concept of synthetic data. The goal of synthetic data is to create an as-realistic-as-possible dataset, one that not only maintains the nuances of the original data, but does so without risk of exposing sensitive information. The combination of differential privacy with synthetic data has been suggested as a best-of-both-worlds solution. In this work, we propose the first noise-free method to construct differentially private synthetic data; we do this through a mechanism called "private sampling". Using the Boolean cube as a benchmark data model, we derive explicit bounds on the accuracy and privacy of the constructed synthetic data. The key mathematical tools are hypercontractivity, duality, and empirical processes. A core ingredient of our private sampling mechanism is a rigorous "marginal correction" method, which has the remarkable property that importance reweighting can be utilized to exactly match the marginals of the sample to the marginals of the population.
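For orientation, the noise-based baseline that the abstract contrasts with can be sketched as follows. This is a minimal illustration of the classical Laplace mechanism applied to a counting query on Boolean data, not the paper's private sampling method. The function names, the toy dataset, and the choice of epsilon are assumptions made for this example only.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(rows, column, epsilon, rng):
    # Hypothetical helper: release how many rows have a 1 in `column`.
    true_count = sum(row[column] for row in rows)
    # Replacing one row changes the count by at most 1 (sensitivity 1),
    # so Laplace noise with scale 1 / epsilon gives epsilon-DP.
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)
# Toy dataset: 1000 points of the Boolean cube {0, 1}^5 (assumed sizes).
rows = [[rng.randint(0, 1) for _ in range(5)] for _ in range(1000)]
true_count = sum(row[0] for row in rows)
released = noisy_count(rows, column=0, epsilon=0.5, rng=rng)
print(true_count, released)
```

The released value differs from the true count by random noise whose typical size grows as epsilon shrinks, which is the accuracy cost the abstract refers to.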
Similar resources
Differentially Private Local Electricity Markets
Privacy-preserving electricity markets have a key role in steering customers towards participation in local electricity markets by guaranteeing the protection of their sensitive information. Moreover, these markets make it possible to statically release and share the market outputs for social good. This paper aims to design a market for local energy communities by implementing Differential Privacy (DP)...
Generating Differentially Private Datasets Using GANs
In this paper, we present a technique for generating artificial datasets that retain statistical properties of the real data while providing differential privacy guarantees with respect to this data. We include a Gaussian noise layer in the discriminator of a generative adversarial network to make the output and the gradients differentially private with respect to the training data, and then us...
PCPs and the Hardness of Generating Private Synthetic Data
Assuming the existence of one-way functions, we show that there is no polynomial-time, differentially private algorithm A that takes a database D ∈ ({0, 1}^d)^n and outputs a “synthetic database” D̂ all of whose two-way marginals are approximately equal to those of D. (A two-way marginal is the fraction of database rows x ∈ {0, 1}^d with a given pair of values in a given pair of columns.) This answers...
Differentially Private Trajectory Data Publication
With the increasing prevalence of location-aware devices, trajectory data has been generated and collected in various application domains. Trajectory data carries rich information that is useful for many data analysis tasks. Yet, improper publishing and use of trajectory data could jeopardize individual privacy. However, it has been shown that existing privacy-preserving trajectory data publish...
Journal
Journal title: SIAM Journal on Mathematics of Data Science
Year: 2022
ISSN: 2577-0187
DOI: https://doi.org/10.1137/21m1449944